The value of statistical or bioinformatics annotation for rare variant association with quantitative trait.

Authors

  • Andrea E Byrnes
  • Michael C Wu
  • Fred A Wright
  • Mingyao Li
  • Yun Li
Abstract

In the past few years, a plethora of methods for rare variant association with phenotype have been proposed. These methods aggregate information from multiple rare variants across genomic region(s), but there is little consensus as to which method is most effective. The weighting scheme adopted when aggregating information across variants is one of the primary determinants of effectiveness. Here we present a systematic evaluation of multiple weighting schemes through a series of simulations intended to mimic large sequencing studies of a quantitative trait. We evaluate existing phenotype-independent and phenotype-dependent methods, as well as weights estimated by penalized regression approaches including Lasso, Elastic Net, and SCAD. We find that the difference in power between phenotype-dependent schemes is negligible when high-quality functional annotations are available. When functional annotations are unavailable or incomplete, all methods suffer from power loss; however, the variable selection methods outperform the others at the cost of increased computational time. Therefore, in the absence of good annotation, we recommend variable selection methods (which can be viewed as "statistical annotation") on top of regions implicated by a phenotype-independent weighting scheme. Further, once a region is implicated, variable selection can help to identify potential causal single nucleotide polymorphisms for biological validation. These findings are supported by an analysis of a high coverage targeted sequencing study of 1,898 individuals.
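To make the idea of a weighting scheme concrete, the following is a minimal sketch of a weighted burden test for a quantitative trait. It is not the authors' implementation. The function names, the simulated data, and the specific frequency-based weight 1/sqrt(p(1-p)) are all assumptions chosen only to illustrate a phenotype-independent scheme, in which weights depend on genotypes but not on the trait.

```python
import numpy as np


def maf_weights(genotypes):
    """Phenotype-independent weights that up-weight rarer variants.

    Assumed illustrative form: 1 / sqrt(p * (1 - p)), where p is a smoothed
    allele-frequency estimate (the +1 / +2 smoothing avoids division by zero
    at monomorphic sites).
    """
    n_samples = genotypes.shape[0]
    p = (genotypes.sum(axis=0) + 1) / (2 * n_samples + 2)
    return 1.0 / np.sqrt(p * (1 - p))


def burden_scores(genotypes, weights):
    """Collapse an (n_samples, n_variants) matrix into one score per sample."""
    return genotypes @ weights


def burden_t_stat(genotypes, trait, weights):
    """t-statistic for the slope when the trait is regressed on the burden score."""
    x = burden_scores(genotypes, weights)
    x = x - x.mean()
    y = trait - trait.mean()
    sxx = x @ x
    slope = (x @ y) / sxx
    resid = y - slope * x
    se = np.sqrt((resid @ resid) / (len(y) - 2) / sxx)
    return slope / se


def simulate(n_samples=2000, n_variants=20, n_causal=5, effect=1.0, seed=0):
    """Hypothetical simulation: rare variants, the first n_causal of which shift the trait."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.001, 0.01, size=n_variants)
    genotypes = rng.binomial(2, maf, size=(n_samples, n_variants))
    trait = effect * genotypes[:, :n_causal].sum(axis=1) + rng.normal(size=n_samples)
    return genotypes, trait


if __name__ == "__main__":
    G, y = simulate()
    print("burden t-statistic:", burden_t_stat(G, y, maf_weights(G)))
```

A phenotype-dependent or variable-selection scheme would replace `maf_weights` with weights estimated from the trait itself, for example coefficients from a penalized regression, which is where the extra computational cost mentioned in the abstract arises.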

Similar resources

geneAttribution: trait agnostic identification of candidate genes associated with noncoding variation

Motivation We have developed geneAttribution, an R package that assigns candidate causal gene(s) to a risk variant identified by a genetic association study such as a GWAS. The method combines user-supplied functional annotation such as expression quantitative trait loci (eQTL) or Hi-C genome conformation data and reports the most likely candidate genes. In the absence of annotation data, geneA...

SimRare: a program to generate and analyze sequence-based data for association studies of quantitative and qualitative traits

MOTIVATION Currently, there is great interest in detecting complex trait rare variant associations using next-generation sequence data. On a monthly basis, new rare variant association methods are published. It is difficult to evaluate these methods because there is no standard to generate data and often comparisons are biased. In order to fairly compare rare variant association methods, it is ...

Using genomic annotations increases statistical power to detect eGenes

MOTIVATION Expression quantitative trait loci (eQTLs) are genetic variants that affect gene expression. In eQTL studies, one important task is to find eGenes or genes whose expressions are associated with at least one eQTL. The standard statistical method to determine whether a gene is an eGene requires association testing at all nearby variants and the permutation test to correct for multiple ...

Phenotyping, association analysis and annotation of genes related to leaf wilting of bread wheat (Triticum aestivum L.) at the seedling stage under drought stress conditions

Rapid screening of plant germplasm in the early stages of growth and determining the genetic basis of wheat leaf wilting index at the seedling stage are necessary for wheat breeding programs. In the present research, leaf wilting index for 290 Iranian bread wheat genotypes, including 90 cultivars and 200 landraces, was studied under drought stress conditions at the seedling stage in 2021 in res...

Improved methods for multi-trait fine mapping of pleiotropic risk loci

MOTIVATION Genome-wide association studies (GWAS) have identified thousands of regions in the genome that contain genetic variants that increase risk for complex traits and diseases. However, the variants uncovered in GWAS are typically not biologically causal, but rather, correlated to the true causal variant through linkage disequilibrium (LD). To discern the true causal variant(s), a variety...

Journal:
  • Genetic epidemiology

Volume 37, Issue 7

Pages: -

Publication date: 2013